METHOD AND APPARATUS FOR USING CLUSTERING
METHOD TO ANALYZE SEMICONDUCTOR DEVICES 

BACKGROUND 

TECHNICAL FIELD

The present invention relates generally to semiconductor technology and more
specifically to semiconductor research, development, and production management.

BACKGROUND ART 

Electronic products are used in almost every aspect of life, and the heart of these 
electronic products is the integrated circuit. Integrated circuits are used in a wide variety of
products, such as televisions, telephones, and appliances.

Integrated circuits are made in and on silicon wafers by extremely complex systems 
that require the coordination of hundreds or even thousands of precisely controlled processes 
to produce a finished semiconductor wafer. Each finished semiconductor wafer has hundreds 
to tens of thousands of integrated circuits, each worth hundreds or thousands of dollars. 

The ideal would be to have every one of the integrated circuits on a wafer functional
and within specifications, but because of the sheer number of processes and minute
variations in the processes, this rarely occurs. "Yield" is the measure of how many "good"
integrated circuits there are on a wafer divided by the maximum number of possible good
integrated circuits on the wafer. A 100% yield is extremely difficult to obtain because minor
variations, due to such factors as timing, temperature, and materials, substantially affect a
process. Further, one process often affects a number of other processes, often in 
unpredictable ways. 
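
The yield definition above amounts to a simple ratio. A minimal sketch in Python follows; the function name `wafer_yield` and the sample die counts are illustrative assumptions and do not come from the disclosure.

```python
def wafer_yield(good_dies: int, max_possible_good_dies: int) -> float:
    """Return yield as a fraction: good dies / maximum possible good dies."""
    if max_possible_good_dies <= 0:
        raise ValueError("max_possible_good_dies must be positive")
    if not 0 <= good_dies <= max_possible_good_dies:
        raise ValueError("good_dies must lie between 0 and the maximum")
    return good_dies / max_possible_good_dies


# Hypothetical wafer: 450 good integrated circuits out of 500 possible.
print(f"{wafer_yield(450, 500):.1%}")  # prints 90.0%
```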

In a manufacturing environment, the primary purpose of experimentation is to increase 
the yield. Experiments are performed in-line and at the end of the production line with both
production wafers and experimental wafers. However, yield enhancement methodologies in
the manufacturing environment produce an abundance of very detailed data for a large 
number of wafers on processes subject only to minor variations. Major variations in the 
processes are not possible because of the time and cost of using production equipment and 
production wafers. Setup times for equipment and processing time can range from weeks to 
months, and processed wafers can each contain hundreds of thousands of dollars worth of
integrated circuits. 

The learning cycle for the improvement of systems and processes requires coming up 
with an idea, formulating a test(s) of the idea, testing the idea to obtain data, studying the data 
to determine the correctness of the idea, and developing new ideas based on the correctness of
the first idea. The faster the correctness of ideas can be determined, the faster new ideas can 
be developed. Unfortunately, the manufacturing environment provides a slow learning cycle 
because of manufacturing time and cost. 

Recently, the great increase in the complexity of integrated circuit manufacturing 
processes and the decrease in time between new product conception and market introduction
have both created the need for speeding up the learning cycle.

This has been accomplished in part by the unique development of the integrated 
circuit research and development environment. In this environment, the learning cycle has
been greatly speeded up and innovative techniques have been developed that have been
extrapolated to high volume manufacturing facilities.

To speed up the learning cycle, processes are speeded up and major variations are
made to many processes, but only a few wafers are processed to reduce cost. The research 
and development environment has resulted in the generation of tremendous amounts of data 
and analysis for all the different processes and variations. This, in turn, has required a large 
number of engineers to do the analysis. With more data, the answer always has been to hire
more engineers. 

However, this is not an acceptable solution for major problems. For example, during 
the production of semiconductor devices, in-line defect inspections are conducted to obtain 
defect data about the devices. In-line defects are detected by inspection techniques conducted
between process steps for fabricating the semiconductor devices. (Actual defects are
determined later using electrical tests after the chip fabrication is completed.) The defect data 
is typically collected by laser scanning, optical, or scanning electron microscope ("SEM"). 
Defects may include a plurality of different events that may have very different respective 
impacts on chip yield. Any irregularities such as structural imperfections, particles, residuals,
or embedded foreign material are considered as defects.

The inspection techniques often result in a total count of the number of defects 
detected in each process step, but not an abundance of in-depth or specific defect data. Total 
count information alone is not adequate for assigning good yield loss projections to defects
detected at each particular process step. 

It is common practice in the semiconductor industry, however, to inspect wafers at 
various times by employing inspection tools during production. The better the inspections, 
the better the data that can potentially shorten yield learning cycles by making it possible to
react quickly to process problems. The process engineer therefore needs to know the number 
of defects per wafer, the x-y coordinates of each defect, and a set of parameters (different for 
different tools) specific for each particular defect. To obtain yield impact projections, it is 
then desirable to correlate the actual defect data to actual electrical failures. Such data can be 
crucial for maximizing yields of a product.

Speed is also critical for efficient manufacturing. Reviewing all the inspected defects, 
even using known automated classification, can significantly delay yield learning cycles and 
the subsequent manufacturing process for the semiconductor devices. 

Therefore, a need exists for a method and system for quickly correlating large amounts 
of in-line defect data in each defect-inspected wafer with location and known defect and yield
data in order to suggest corresponding process anomalies associated with such defects so that 
appropriate process adjustments and corrections can be made. 

Solutions to these problems have been long sought but prior developments have not 
taught or suggested any solutions and, thus, solutions to these problems have long eluded 
those skilled in the art.

DISCLOSURE OF THE INVENTION 

The present invention provides a method for analyzing a semiconductor device. A
semiconductor device is tested to produce first and second data. A clustering method is applied to
the first data, creating a clustered first data. The clustered first data is then correlated with the
second data to determine analyzed data. This process allows for the handling of large
amounts of semiconductor testing information and reduces the complexity and time required 
for testing. 

Certain embodiments of the invention have other advantages in addition to or in place 
of those mentioned above. The advantages will become apparent to those skilled in the art 
from a reading of the following detailed description when taken with reference to the
accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a tester information processing system according to the 
present invention; 

FIG. 2 shows a graph of IV curves representing data generated in the generation block;

FIG. 3 shows a graph of Vt distributions representing data generated in the generation
block;

FIG. 4 is a box chart after the application of a clustering method; 

FIG. 5 is a presentation block showing correlated data; and 

FIG. 6 is a system for carrying out an embodiment of the present invention. 

DETAILED DESCRIPTION OF THE INVENTION

Referring now to FIG. 1, therein is shown a block diagram of a tester information 
processing system 100 according to the present invention. The tester information processing 
system 100 is the result of the discovery that at times a single fundamental block can solve 
the problems presented but often there are four fundamental blocks to solving the problems 
presented.

The four fundamental blocks are a generation block 101, an extraction block 102, an 
analysis block 103, and a presentation block 104. Each of the blocks can stand independently 
in the tester information processing system 100, and within these blocks are various 
commercially available techniques, methodologies, processes, circuitry, and approaches as 
well as the invention disclosed herein. The four fundamental blocks are discussed in the
approximate chronology that the blocks are used in the tester information processing system 
100. 

The tester information processing system 100 includes various pieces of commercially 
available production, test, research, and development semiconductor equipment, which 
operate on and manipulate information and/or data, which are generically defined herein as
"information". The tester information processing system 100 receives information from a 
tester 105, which is connected to a system-under-test 106. 

In the integrated circuit field, the tester 105 can be a semiconductor test system for 
testing wafers or die and the system-under-test 106 can be anything from a complete wafer
down to an element of an individual semiconductor device on a die.

In the generation block 101, basic information is generated looking at new and old
products, new and old processes, product and process problems, unexpected or unpredictable 
results and variations, etc. Generation of the information may use the tester 105 itself, 
conventional test information, a personal computer, etc. It has been discovered that it is 
possible to program the tester 105 to automatically collect current versus voltage or threshold
voltage distribution for all die on a wafer. It may also require new equipment and/or 
methods, which are described herein when required. 

In the extraction block 102, usable information is extracted from the generated 
information from the generation block 101. Essentially, the generated information is 
translated into more useful forms; e.g., broken apart so it can be reassembled in different
forms to show different inter-relationships. 

For example, most testing equipment provides raw data in massive test files. 
Sometimes, millions of measurements provide millions of pieces of information, which must 
be digested and understood. The test files seldom have a user-friendly tabular output of 
parameter and value. Even where somewhat user-friendly outputs are provided, there are
problems with the proper schema for storing the usable data and for formatting the data for 
subsequent analysis. 
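
The translation of raw test output into a tabular form of parameter and value can be sketched as follows. The raw line layout (`DIE <x> <y> NAME=VALUE ...`), the `Measurement` record, and the function name `extract_measurements` are all hypothetical; real tester file formats differ by vendor and are not specified in this disclosure.

```python
from typing import List, NamedTuple


class Measurement(NamedTuple):
    """One tabular row: die position, parameter name, measured value."""
    die_x: int
    die_y: int
    parameter: str
    value: float


def extract_measurements(raw_lines: List[str]) -> List[Measurement]:
    """Translate raw tester lines into one (die, parameter, value) row each.

    Assumed (hypothetical) layout: "DIE <x> <y> NAME=VALUE [NAME=VALUE ...]".
    Lines that do not match, such as headers or blank lines, are skipped.
    """
    rows: List[Measurement] = []
    for line in raw_lines:
        fields = line.split()
        if len(fields) < 4 or fields[0] != "DIE":
            continue
        die_x, die_y = int(fields[1]), int(fields[2])
        for field in fields[3:]:
            name, _, text = field.partition("=")
            if not name or not text:
                continue  # skip malformed NAME=VALUE pairs
            rows.append(Measurement(die_x, die_y, name, float(text)))
    return rows


raw = [
    "# tester header (hypothetical)",
    "DIE 0 0 VT=0.71 IDSAT=1.2e-4",
    "DIE 0 1 VT=0.69 IDSAT=1.1e-4",
]
for row in extract_measurements(raw):
    print(row.die_x, row.die_y, row.parameter, row.value)
```

Storing one row per die and parameter is only one possible schema; it is chosen here because it can be regrouped by die, by parameter, or by wafer region for later analysis.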

Extraction of the usable information may also require new equipment and/or methods. 
Sometimes, extraction includes storing the information for long duration experiments or for
different experiments, which are described herein when required.

In the analysis block 103, the usable information from the extraction block 102 is 
analyzed. In some cases, this can include mapping, commonality, or correlation of the test 
data to physical locations in the system-under-test 106. Unlike previous systems where a few 
experiments were performed and/or a relatively few data points were determined, the sheer
volume of experiments and data precludes easy analysis of trends in the data or the ability to
make predictions based on the data. Analysis of the extracted information may also require 
new equipment and/or methods, which are described herein when required. 

In the presentation block 104, the analyzed information from the analysis block 103 is 
manipulated and presented in a comprehensible form to assist others in understanding the
significance of the analyzed data. The huge amount of analyzed information often leads to
esoteric presentations, which are not useful per se, misleading, or boring. Proper presentation
often is an essential ingredient for making informed decisions on how to proceed to achieve
yield and processing improvements. In some cases, problems cannot even be recognized
unless the information is presented in an easily understood and digested form, and this often 
requires new methods of presentation, which are described herein when required. 

In the production of semiconductor devices, each process step must be developed and 
stabilized as quickly as possible. It is therefore essential to perform failure analysis rapidly,
efficiently, and effectively, so that the results of the failure analysis can facilitate quick repair 
of the process defect that caused the failure. 

Referring now to FIG. 2, therein is shown a graph of current-voltage (IV) curves 200.
These curves are representative of the raw test data that can be generated in the generation 
block 101, and are one form the usable information received by the analysis block 103 might
take. The test data can have clusters of similar IV curves such as clusters A 202, B 204, and 
C 206. As this test data can be generated automatically, massive amounts of data can be 
generated for a single system-under-test 106. 

Referring now to FIG. 3, therein is shown a graph of threshold voltage (Vt) 
distributions 300. These distributions are representative of the raw test data that can be
generated in the generation block 101, and are another form the usable information received 
by the analysis block 103 might take. The test data can have clusters of similar Vt 
distributions such as clusters D 302, E 304, and F 306. As this test data can be generated 
automatically, massive amounts of data can be generated for a single system-under-test 106. 

Referring now to FIG. 4, therein is shown a box chart 400 after the application of a
clustering method to the usable information received by the analysis block 103. The
application of a clustering method occurs in the analysis block 103 and makes utilizing the
vast amounts of data more manageable. The box chart 400 is an example of a possible result
of the clustering method including boxes G 402, H 404, and I 406, which represent ranges of
clustered data with similar characteristics such as the clusters A 202, B 204, and C 206 in
FIG. 2, or D 302, E 304, and F 306 in FIG. 3. These boxes could be viewed as three separate 
regions on the system-under-test 106 which have been exposed to three slightly different 
processes, or process-splits. 
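
One way to derive the ranges that such boxes represent is to summarize the values in each cluster. The sketch below assumes each die has a single scalar value (for example a threshold voltage) and an already assigned cluster label; the function name `cluster_ranges`, the choice of minimum, median, and maximum as the summary, and the sample numbers are illustrative assumptions.

```python
from statistics import median
from typing import Dict, List, Tuple


def cluster_ranges(
    values: List[float], labels: List[str]
) -> Dict[str, Tuple[float, float, float]]:
    """Return (min, median, max) of the values belonging to each cluster."""
    grouped: Dict[str, List[float]] = {}
    for value, label in zip(values, labels):
        grouped.setdefault(label, []).append(value)
    return {
        label: (min(vals), median(vals), max(vals))
        for label, vals in grouped.items()
    }


# Hypothetical threshold voltages (volts) with cluster labels G and H.
vt_values = [0.70, 0.72, 0.71, 0.90, 0.92, 0.91]
vt_labels = ["G", "G", "G", "H", "H", "H"]
for label, summary in cluster_ranges(vt_values, vt_labels).items():
    print(label, summary)
```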

The clustering method can be K-means clustering or Spatial Signature Analysis 
(SSA). K-means clustering is a nonhierarchical clustering method, which repeatedly
examines data to create and refine clusters in order to maximize the significance of intergroup 
distance. K-means clustering can be used to create a classification that can be used for 
subsequent analysis such as wafer mapping, commonality, and correlation. SSA has been
developed only to analyze group and wafer patterns. Treating the data, such as the IV curves 
200 and the Vt distributions 300, as a "wafer", SSA can be used to allow subsequent analysis 
such as wafer mapping, commonality, and correlation. 
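
The repeated create-and-refine loop of K-means clustering can be sketched in plain Python. This is a minimal illustration using the standard Lloyd iteration (assign each curve to its nearest centroid, then recompute centroids), which reduces within-cluster squared distance; the disclosure does not state which K-means variant is intended. The farthest-point seeding, the function names, and the sample IV curves are assumptions made for this example.

```python
from typing import List, Sequence

Curve = Sequence[float]


def sq_dist(a: Curve, b: Curve) -> float:
    """Squared Euclidean distance between two equally sampled curves."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(curves: List[Curve], k: int, max_iter: int = 100) -> List[int]:
    """Return a cluster label (0..k-1) for each curve using Lloyd's algorithm."""
    if not 1 <= k <= len(curves):
        raise ValueError("k must be between 1 and the number of curves")
    # Farthest-point seeding: an assumption of this sketch, chosen so that
    # the result is deterministic. Other seedings are equally valid.
    centroids = [list(curves[0])]
    while len(centroids) < k:
        farthest = max(
            curves, key=lambda c: min(sq_dist(c, m) for m in centroids)
        )
        centroids.append(list(farthest))
    labels = [-1] * len(curves)
    for _ in range(max_iter):
        # Assignment step: attach each curve to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(c, centroids[j]))
            for c in curves
        ]
        if new_labels == labels:
            break  # converged: no curve changed cluster
        labels = new_labels
        # Update step: move each centroid to the mean of its member curves.
        for j in range(k):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical IV curves: current sampled at four fixed voltages per die.
iv_curves = [
    [0.0, 0.1, 0.4, 0.9],
    [0.0, 0.1, 0.5, 1.0],
    [0.0, 0.5, 1.5, 3.0],
    [0.0, 0.6, 1.6, 3.1],
    [0.0, 1.0, 3.0, 6.0],
    [0.0, 1.1, 3.1, 6.2],
]
print(k_means(iv_curves, k=3))
```

Treating each curve as a fixed-length vector is what lets a whole IV curve or Vt distribution be classified as one object instead of as a collection of individual data points.
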

Referring now to FIG. 5, therein is shown a presentation block 104 wherein data from
the analysis block 103 has been correlated with the corresponding areas of the system-under-test
106, in this case a semiconductor wafer, to make a wafer map 500 with the analyzed data.
Areas G 502, H 504, and I 506 are areas with characteristics similar to the clusters of data in 
boxes G 402, H 404, and I 406 in FIG. 4. While the characteristics of some areas might
indicate defects, characteristics of other areas might indicate areas with desired behavior.
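
Correlating cluster labels with die locations to form a wafer map can be sketched as a lookup from die coordinates to cluster label. The coordinate convention (integer x-y die indices), the function name `render_wafer_map`, and the sample layout are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, Tuple


def render_wafer_map(
    die_labels: Dict[Tuple[int, int], str], empty: str = "."
) -> str:
    """Render cluster labels keyed by (x, y) die position as a text grid."""
    if not die_labels:
        return ""
    xs = [x for x, _ in die_labels]
    ys = [y for _, y in die_labels]
    rows = []
    for y in range(min(ys), max(ys) + 1):
        row = "".join(
            die_labels.get((x, y), empty) for x in range(min(xs), max(xs) + 1)
        )
        rows.append(row)
    return "\n".join(rows)


# Hypothetical cluster labels per die; positions without a die print as ".".
die_labels = {
    (1, 0): "G", (2, 0): "G",
    (0, 1): "G", (1, 1): "H", (2, 1): "H", (3, 1): "I",
    (1, 2): "H", (2, 2): "I",
}
print(render_wafer_map(die_labels))
```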

While FIGs. 2, 3, 4, and 5 depict the data generally as clearly defined, it
will be understood that the variety of characteristics in practice is as diverse as the production 
and process defects that lead to giving wafer die their complex characteristics and defects. As 
an example, asymmetrical defect clustering (not shown) toward one side of a wafer might
indicate an uneven exposure to etchant. Such uneven exposure to etchant might occur when a
wafer is immersed into an etchant in a manner that exposes one side of the wafer to the 
etchant noticeably longer than the opposite side. 
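
A crude indicator of such one-sided clustering is the offset of the defect centroid from the wafer center. The sketch below assumes defect coordinates are given relative to the wafer center in the same units as the wafer radius; the function name `defect_centroid_offset` and the sample coordinates are illustrative assumptions, and any threshold for calling a wafer "asymmetrical" would have to be chosen per process.

```python
from typing import List, Tuple


def defect_centroid_offset(
    defects: List[Tuple[float, float]], radius: float
) -> Tuple[float, float]:
    """Return the defect centroid as a fraction of the wafer radius.

    Coordinates are assumed to be measured from the wafer center at (0, 0).
    A result near (0, 0) suggests no strong pull toward one side; a large
    component suggests defects concentrated on that side.
    """
    if not defects:
        raise ValueError("at least one defect is required")
    if radius <= 0:
        raise ValueError("radius must be positive")
    cx = sum(x for x, _ in defects) / len(defects)
    cy = sum(y for _, y in defects) / len(defects)
    return cx / radius, cy / radius


# Hypothetical defects (mm) on a wafer of radius 150 mm, skewed toward +x.
skewed = [(100.0, 0.0), (50.0, 0.0)]
print(defect_centroid_offset(skewed, radius=150.0))
```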

A method according to the present invention thus analyzes semiconductor test data, 
such as wafer defect data. This analyzed data can then be correlated with physical data to
identify defect areas, predict the causes, and suggest solutions. Yield learning cycles are
therefore accelerated by the present invention, defect causes are more quickly identified, and 
corrective yield impact projections are promptly and accurately generated. The corresponding 
manufacturing process problems are then corrected and optimized more quickly, and process 
yields are correspondingly improved more rapidly. 

It will be readily understood, based upon this disclosure, that the same methodology
and equipment of the present invention may also be used to analyze other semiconductor test
data types that are currently treated as collections of individual data points in addition to Vt 
distributions and IV curves. 

It will also be readily understood, based upon this disclosure, that other forms of
cluster analysis in addition to K-means clustering and SSA may be used to analyze
semiconductor test data. The result is much faster and more accurate analyses that
advantageously avoid current limitations such as manual classification, intensive
computation, and so forth. 

Referring now to FIG. 6, therein is shown a system 600 in accordance with an 
embodiment of the present invention in which the blocks are steps in a method or circuitry for 
carrying out the steps. The system 600 includes: testing a semiconductor device to produce 
first data and second data in a block 602; applying a clustering method to the first data to 
create a clustered first data in a block 604; and correlating the clustered first data with the 
second data to determine analyzed data in a block 606. 
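
The three blocks can be sketched end to end on toy data. Here the first data is one threshold voltage per die and the second data is a pass/fail electrical result per die; both are hypothetical stand-ins. For brevity the clustering step uses a simple gap-based grouping of sorted values in place of K-means or SSA, and the correlation step is a per-cluster pass fraction. The function names `cluster_by_gap` and `correlate` are assumptions of this example.

```python
from collections import defaultdict
from typing import Dict, List


def cluster_by_gap(values: List[float], max_gap: float) -> List[int]:
    """Label 1-D values: sorted neighbors no more than max_gap apart share a cluster."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    current = 0
    for prev, idx in zip(order, order[1:]):
        if values[idx] - values[prev] > max_gap:
            current += 1  # a large gap starts a new cluster
        labels[idx] = current
    return labels


def correlate(labels: List[int], passed: List[bool]) -> Dict[int, float]:
    """Return the pass fraction of the second data within each cluster."""
    totals: Dict[int, int] = defaultdict(int)
    passes: Dict[int, int] = defaultdict(int)
    for label, ok in zip(labels, passed):
        totals[label] += 1
        passes[label] += int(ok)
    return {label: passes[label] / totals[label] for label in totals}


# Block 602: testing produces first data (Vt per die) and second data (pass/fail).
vt = [0.70, 0.71, 0.90, 0.91, 0.72, 0.89]
passed = [True, True, False, True, True, False]

# Block 604: apply a clustering method to the first data.
labels = cluster_by_gap(vt, max_gap=0.05)

# Block 606: correlate the clustered first data with the second data.
analyzed = correlate(labels, passed)
for label, fraction in sorted(analyzed.items()):
    print(f"cluster {label}: pass fraction {fraction:.2f}")
```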

While the invention has been described in conjunction with a specific best mode, it is 
to be understood that many alternatives, modifications, and variations will be apparent to 
those skilled in the art in light of the foregoing description. Accordingly, it is intended to
embrace all such alternatives, modifications, and variations that fall within the spirit and
scope of the included claims. All matters hitherto set forth or shown in the
accompanying drawings are to be interpreted in an illustrative and non-limiting sense. 